For graph diagrams you need nodes and edges. Let’s use the create_node_df() function to specify a collection of nodes and contain them in a data frame (a node data frame, or ndf). Immediately after that, inspect the node data frame.
ndf <-
create_node_df(
n = 3,
label = TRUE)
ndf
## id type label
## 1 1 <NA> 1
## 2 2 <NA> 2
## 3 3 <NA> 3
The n argument is required here and it must indicate the number of nodes you intend this object to contain. The use of label = TRUE allows for copying of the node IDs as the node label (which is a node attribute). This is not always desirable, however. A better option is to specify a vector of text labels (you can use all manner of characters), but ensure that this vector is the same length as specified by n. If you ensure that the label node attribute always contains unique values, later on you can select individual nodes by label and perform actions on them.
You may have noticed the node attribute type in the output. Values may optionally be provided for this such that the values are useful for categorizing collections of nodes. Let’s refine the ndf object and include two different type values.
ndf <-
create_node_df(
n = 3,
type = c("A", "A", "B"),
label = TRUE)
ndf
## id type label
## 1 1 A 1
## 2 2 A 2
## 3 3 B 3
Now onto the edges, those connections between the nodes. The edge connections are also collected in a data frame (this time, an edge data frame or edf). The create_edge_df() function is used to generate this object.
edf <-
create_edge_df(
from = c(1, 1),
to = c(2, 3))
edf
## id from to rel
## 1 1 1 2 <NA>
## 2 2 1 3 <NA>
The from and to arguments specify which nodes for the edge are outgoing and incoming, respectively. Here, the edges are: 1->2 and 1->3. For directed graphs, the order is essential. The rel argument allows for inclusion of text labels in the same manner as the node type. This is useful for targeting specific groups of edges during a selection or traversal. Let’s refine the edf object and include two different rel values.
edf <-
create_edge_df(
from = c(1, 1),
to = c(2, 3),
rel = c("X", "Y"))
edf
## id from to rel
## 1 1 1 2 X
## 2 2 1 3 Y
Now that we have an ndf and an edf, we can now combine those into a graph object.
the_graph <-
create_graph(
nodes_df = ndf,
edges_df = edf)
Having a graph object is a great first step and you can do a lot with them (inspect, modify, join with other graph objects, etc.). Whenever you’d like to view the graph, use the render_graph() function. There are three renderers for graphs:
### Graphviz
# Show the graph using the Graphviz engine
render_graph(graph = the_graph, output = "graph")
# Show the graph using the visNetwork engine
render_graph(graph = the_graph, output = "visNetwork")
You don’t really need to have a node data frame in conjunction with the edge data frame to make a graph object. You can have both, either of them, or neither of them (creating an empty graph). Here is the empty graph case:
# Render a just created graph; there aren't
# any nodes for the view should be empty
render_graph(
create_graph())
We can just display nodes without edges by passing in just a node data frame to the create_graph() function. This a reasonable way to initialize a graph and edges between these initial nodes can always be added in later with functions like add_edges(), add_edges_w_string(), add_edge_df(), and add_edges_from_table().
# Create the node data frame
ndf <-
create_node_df(
n = 3,
label = c("A", "B", "C"))
# Use the `create_graph()` function
the_graph <- create_graph(nodes_df = ndf)
# View the graph
render_graph(graph = the_graph)
Here is an example with 26 nodes, this time wrapping the create_nodes(), create_graph(), and render_graph() function calls into one big statement. We can provide exactly 26 capital letters as node labels by providing the built-in LETTERS vector to label.
render_graph(
create_graph(
nodes_df =
create_node_df(
n = 26,
label = LETTERS)))
Unicode characters can also be used as label attribute values:
render_graph(
create_graph(
nodes_df =
create_node_df(
n = 3,
label = c("α", "β", "γ"))))
Nesting calls like this becomes less readable with increasing numbers of operations. DiagrammeR works very well with the magrittr package to allow forward-piping of graph operations with the %>% operator. Here is the same set of functions as above, except using %>% to chain the operations left to right:
create_node_df(
n = 3, label = c("α", "β", "γ")) %>%
create_graph(nodes_df = .) %>%
render_graph()
The . character in the create_graph() statement indicates that the returned output from the previous create_node_df() statement should go here. Otherwise, the default behavior is that the output from a statement becomes the value for the first argument (which is often omitted entirely). The functions in DiagrammeR are designed to have this forward-pipe pattern work well. With further examples, you’ll see how this can make even complicated graph operations easier to compose and to read.